ABSTRACT

Tests for comparing the strength of association between a
variable Y and each of two potential predictor variables are
proposed and examined in a simulation study. The variances of
the two predictors and the correlation between them are
nuisance parameters. A simple modification of a test proposed by
Williams (1959) is found to have good properties for a wide range
of parameter values and for both normal and nonnormal distributions.

INTRODUCTION

In a number of statistical settings, particularly in regression,
it is desirable to know which of two random variables, say $X_2$ and
$X_3$, is more strongly correlated with a dependent random variable $X_1$.

Under the assumption that the observations are from a trivariate
normal distribution, a number of tests for the hypothesis
$\rho_{12} = \rho_{13}$ have been proposed. These have been analyzed
and compared in some detail by Neill and Dunn (1975).

Further proposals have been made for a much more general setting
where the underlying distribution cannot be regarded as normal and
where the measure of strength of the relationship between the
dependent and independent variables may differ from the Pearson
product-moment correlation coefficient. Hubert and Golledge (1981)
also discuss the situation where no specific population model is
obvious.


Our intent is to examine a number of such suggestions, compare
them with the procedures recommended in Neill and Dunn for the
trivariate normal situation, observe their behavior under nonnormal
distributions, and draw conclusions about their relative merits.


HISTORY 


Let $X_1$, $X_2$, $X_3$ have a continuous trivariate distribution with
covariance matrix $\Sigma$. Let $\rho_{ij}$ denote the $(i,j)$th element
of $\Sigma$, with $\rho_{ii} = 1$ ($i = 1, 2, 3$). For the trivariate
normal, proposals for testing $H_0$: $\rho_{12} = \rho_{13}$ have been
available since 1940, when Hotelling proposed as a test statistic the
difference $r_{12} - r_{13}$ (where $r_{ij}$ is the appropriate sample
correlation coefficient) divided by an estimate of the asymptotic
standard deviation of $r_{12} - r_{13}$. Neill and Dunn (1975) use both
analytic methods and simulations in comparing eleven different test
statistics, including Hotelling's, for this particular situation and
recommend a statistic proposed by Williams (1959) as the best choice
for small to moderate sample sizes.

Williams' test statistic, $T$, also relies on a standardized
version of $r_{12} - r_{13}$ and is only slightly different from
Hotelling's proposal.


The one-tailed test compares $T$ to the upper percentile of the
t-distribution on $n - 3$ degrees of freedom.
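As a minimal sketch, Williams' statistic is commonly written as $T = (r_{12} - r_{13}) \sqrt{(n-1)(1+r_{23}) / [2\frac{n-1}{n-3}|R| + \bar{r}^2 (1-r_{23})^3]}$, with $\bar{r} = (r_{12} + r_{13})/2$ and $|R| = 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23}$. The function below computes that form. The name `williams_t` and the example correlations are illustrative assumptions, not taken from the text.

```python
import math

def williams_t(r12, r13, r23, n):
    """Williams' statistic for H0: rho12 = rho13 (commonly quoted form)."""
    # |R|: determinant of the 3x3 sample correlation matrix.
    det_r = 1.0 - r12**2 - r13**2 - r23**2 + 2.0 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2.0
    denom = 2.0 * (n - 1) / (n - 3) * det_r + r_bar**2 * (1.0 - r23) ** 3
    return (r12 - r13) * math.sqrt((n - 1) * (1.0 + r23) / denom)

# Hypothetical sample correlations from n = 25 observations.
t_stat = williams_t(0.6, 0.3, 0.2, n=25)
```

The resulting value would be referred to the t-distribution on n - 3 degrees of freedom, as described above.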

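As discussed in Boyer and Schucany (1978), a distribution-free
approach to this problem relies on observations by Wolfe (1976)
that the correlation between $X_1$ and $Z = X_2 - X_3$ is given by

$$\rho_{1Z} = \frac{\rho_{12}\sigma_2 - \rho_{13}\sigma_3}{\sqrt{\sigma_2^2 + \sigma_3^2 - 2\rho_{23}\sigma_2\sigma_3}}$$

where $\sigma_2$ and $\sigma_3$ are the standard deviations of $X_2$ and
$X_3$, and thus, if $\sigma_2 = \sigma_3$, $\rho_{1Z} = 0 \iff \rho_{12} = \rho_{13}$.
The restriction that $X_2$ and $X_3$ have the same scale is also needed for
Kendall's $\tau = 0$ to imply $\rho_{12} = \rho_{13}$.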
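The role of the equal-scale restriction can be checked numerically. The sketch below evaluates the correlation between $X_1$ and $Z = X_2 - X_3$, taking the denominator to be the standard deviation of $Z$, which follows from $\mathrm{Var}(Z) = \sigma_2^2 + \sigma_3^2 - 2\rho_{23}\sigma_2\sigma_3$. The function name and the parameter values are illustrative assumptions.

```python
import math

def corr_x1_z(rho12, rho13, rho23, sigma2, sigma3):
    """Correlation between X1 and Z = X2 - X3."""
    numerator = rho12 * sigma2 - rho13 * sigma3
    var_z = sigma2**2 + sigma3**2 - 2.0 * rho23 * sigma2 * sigma3
    return numerator / math.sqrt(var_z)

# Same correlations (rho12 = rho13 = 0.5), equal versus unequal scales.
equal_scale = corr_x1_z(0.5, 0.5, 0.3, sigma2=1.0, sigma3=1.0)
unequal_scale = corr_x1_z(0.5, 0.5, 0.3, sigma2=1.0, sigma3=2.0)
```

With equal scales the correlation vanishes under the null hypothesis; with unequal scales it does not, even though $\rho_{12} = \rho_{13}$ still holds.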

A number of proposals for test statistics make use of this
requirement by replacing sample values of $X_2$ and $X_3$ with a set of
scores that will circumvent the scale problem. One possibility is
to replace each of the elements of the $X_2$ and $X_3$ vectors by their
integer ranks, $R(X_{2i})$ and $R(X_{3i})$ respectively. A problem that
arises here is that $Z_i = R(X_{2i}) - R(X_{3i})$ may involve a
substantial number of tied values. An additional possibility is
replacing $X_{2i}$ and $X_{3i}$ by their expected normal scores, $N(X_{2i})$
and $N(X_{3i})$. This will reduce the magnitude of the tie problem
but will still eliminate the scale difficulties. Then any of
the usual nonparametric measures of correlation could be used to
detect association between the $X_{1i}$ and $Z^*_i = N(X_{2i}) - N(X_{3i})$.
Note that there does not appear to be a familiar population quantity
corresponding to the relationship between $X_{1i}$ and $Z^*_i$;
nevertheless, the technique does allow one to make general inferences
about the relationships of $X_1$ to $X_2$ and $X_3$.
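The two scoring schemes can be sketched as follows. Exact expected normal scores require tables or numerical integration, so this sketch substitutes Blom's approximation, $\Phi^{-1}((r - 3/8)/(n + 1/4))$, which is an assumption and not necessarily what the study used. The data values and function names are hypothetical.

```python
from statistics import NormalDist

def ranks(values):
    """Integer ranks 1..n (assumes no ties among the values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def normal_scores(values):
    """Approximate expected normal scores via Blom's formula (an assumption)."""
    n = len(values)
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks(values)]

# Hypothetical data; x3 is on a very different scale from x2.
x2 = [2.1, 0.4, 3.3, 1.7, 5.0, 2.8]
x3 = [10.0, 40.0, 20.0, 35.0, 60.0, 15.0]

z_rank = [a - b for a, b in zip(ranks(x2), ranks(x3))]
z_star = [a - b for a, b in zip(normal_scores(x2), normal_scores(x3))]
```

Both difference vectors are free of the original scales. In this small example the rank differences take only four distinct values among six observations, which illustrates the tie problem that motivates the normal-score alternative.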

A second class of procedures that could be used to test the
hypothesis of interest would involve transformation of the
observations for each of $X_1$, $X_2$ and $X_3$ so that they are somewhat
normal and then apply one of the normal theory methods (probably
Williams' test) to the transformed data. It is felt that in nonnormal
situations this procedure will yield a test statistic that is more
stable than just using Williams' test on the raw data.
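One way to sketch this second class of procedures is to replace every variable by approximate normal scores and then compute Williams' statistic from the correlations of the scores. Both the choice of transformation (Blom's approximation to expected normal scores) and the commonly quoted form of Williams' statistic are assumptions here; the data and the name `williams_on_scores` are hypothetical.

```python
import math
from statistics import NormalDist

def pearson(x, y):
    """Sample product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def normal_scores(values):
    """Blom-type approximate normal scores (assumes no ties)."""
    n = len(values)
    standard_normal = NormalDist()
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        scores[index] = standard_normal.inv_cdf((rank - 0.375) / (n + 0.25))
    return scores

def williams_t(r12, r13, r23, n):
    """Williams' statistic in its commonly quoted form."""
    det_r = 1.0 - r12**2 - r13**2 - r23**2 + 2.0 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2.0
    denom = 2.0 * (n - 1) / (n - 3) * det_r + r_bar**2 * (1.0 - r23) ** 3
    return (r12 - r13) * math.sqrt((n - 1) * (1.0 + r23) / denom)

def williams_on_scores(x1, x2, x3):
    """Transform each variable to normal scores, then apply Williams' statistic."""
    s1, s2, s3 = normal_scores(x1), normal_scores(x2), normal_scores(x3)
    return williams_t(pearson(s1, s2), pearson(s1, s3), pearson(s2, s3), len(x1))

# Hypothetical data, and the same data after a monotone (exponential) distortion.
x1 = [0.2, 1.5, -0.7, 2.3, 0.9, -1.1, 1.8, 0.4]
x2 = [0.1, 1.2, -0.3, 1.9, 1.4, -0.9, 2.2, 0.0]
x3 = [1.0, 0.3, -0.5, 0.8, -1.2, 0.6, 1.7, -0.1]
t_raw = williams_on_scores(x1, x2, x3)
t_exp = williams_on_scores([math.exp(v) for v in x1],
                           [math.exp(v) for v in x2],
                           [math.exp(v) for v in x3])
```

Because the scores depend on the data only through ranks, the statistic is unchanged by any monotone increasing transformation of the individual variables, which is one reason to expect more stable behavior under skewed distributions.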

Several tests utilizing one of these methods or a simple
extension of one of them provided a starting place for a
substantial simulation study for comparison. A number of additional
procedures, as described in Boyer and Schucany (1978), were also
used in the early stages of the investigation, but proved to be
inadequate even in the very simplest situation where the underlying
distribution was trivariate normal and the null hypothesis
$H_0$: $\rho_{12} = \rho_{13}$ was true. These methods were thus
eliminated from subsequent parts of the study.

For instance, the procedure proposed by Davis and Quade (1968)
uses Kendall's tau as the measure of correlation and a U-statistics
approach to the hypothesis testing problem. However, the initial runs
indicated that its empirical power was dominated by that of the Choi
procedure. This, combined with the additional fact that the U-statistic
approach is more complicated computationally than the procedures using
ranks, led to the procedure being dropped from the study.
THE SIMULATION STUDY


An extensive simulation study was run to compare the test
statistics listed below. For each parameter configuration and
distribution assumption, 1000 samples of size 10 and 1000 samples
of size 25 were generated and the appropriate one-tailed test
performed at a nominal level of .05.
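The design of one cell of such a study can be sketched as below for Williams' test at a single null configuration. This is an illustration under stated assumptions, not the study's code: the generator is Python's `random` module with a hand-written Cholesky factor for a unit-variance correlation matrix, Williams' statistic is taken in its commonly quoted form, and the critical value 1.895 is the tabled upper 5% point of t on 7 degrees of freedom. The configuration (.5, .5, .3) and the seed are arbitrary.

```python
import math
import random

def pearson(x, y):
    """Sample product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def williams_t(r12, r13, r23, n):
    """Williams' statistic in its commonly quoted form."""
    det_r = 1.0 - r12**2 - r13**2 - r23**2 + 2.0 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2.0
    denom = 2.0 * (n - 1) / (n - 3) * det_r + r_bar**2 * (1.0 - r23) ** 3
    return (r12 - r13) * math.sqrt((n - 1) * (1.0 + r23) / denom)

def draw_sample(n, rho12, rho13, rho23, rng):
    """n trivariate normal observations with unit variances."""
    # Entries of the lower-triangular Cholesky factor of the correlation matrix.
    l22 = math.sqrt(1.0 - rho12**2)
    l32 = (rho23 - rho12 * rho13) / l22
    l33 = math.sqrt(1.0 - rho13**2 - l32**2)
    x1, x2, x3 = [], [], []
    for _ in range(n):
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        x1.append(e1)
        x2.append(rho12 * e1 + l22 * e2)
        x3.append(rho13 * e1 + l32 * e2 + l33 * e3)
    return x1, x2, x3

def empirical_level(n, rho12, rho13, rho23, critical_value, reps=1000, seed=1):
    """Fraction of simulated samples in which the one-tailed test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x1, x2, x3 = draw_sample(n, rho12, rho13, rho23, rng)
        t = williams_t(pearson(x1, x2), pearson(x1, x3), pearson(x2, x3), n)
        if t > critical_value:
            rejections += 1
    return rejections / reps

# n = 10, so the reference distribution has n - 3 = 7 degrees of freedom.
level = empirical_level(10, 0.5, 0.5, 0.3, critical_value=1.895)
```

Repeating the call over a grid of parameter configurations, sample sizes, and test statistics would give tables of observed significance levels and powers of the kind reported later.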

Using the IMSL subroutine GGNSM, the first samples were
generated with $(X_1, X_2, X_3)$ having a trivariate normal
distribution with variance-covariance matrix $\Sigma$.


Note that the parameter $\rho_{23}$ is a nuisance parameter which must be
handled. Under $H_0$ the study used 53 parameter configurations which
adequately cover all the possibilities for $\rho_{12} = \rho_{13}$ and
$\rho_{23}$ that give a positive definite covariance matrix. Under the
alternative hypothesis, 52 different configurations, limited to the
cases where both $\rho_{12}$ and $\rho_{13}$ are positive, were used. If
the signs of $\rho_{12}$ and $\rho_{13}$ are known, as is often the case in
practical situations, an appropriate change of sign on one of the
variables can always be made so that $\rho_{12}$ and $\rho_{13}$ are
positive. Some of the early runs included configurations where one or
both of these parameters were negative, but all the results were
strictly consistent with the case where both parameters are positive.
So those situations were not used in subsequent runs.
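The positive definiteness requirement restricts which values of the nuisance parameter can accompany a given $\rho_{12} = \rho_{13}$. A minimal check, assuming a correlation matrix with unit diagonal, uses the leading principal minors (Sylvester's criterion). The grid of tenths below is a hypothetical illustration and is not the set of 53 configurations used in the study.

```python
def is_positive_definite(rho12, rho13, rho23):
    """Sylvester's criterion for a 3x3 correlation matrix with unit diagonal."""
    second_minor = 1.0 - rho12**2
    det = 1.0 - rho12**2 - rho13**2 - rho23**2 + 2.0 * rho12 * rho13 * rho23
    return second_minor > 0.0 and det > 0.0

# Values of rho23 on a grid of tenths that are admissible when rho12 = rho13 = .9.
admissible = [k / 10 for k in range(-9, 10)
              if is_positive_definite(0.9, 0.9, k / 10)]
```

When $\rho_{12} = \rho_{13} = \rho$ the determinant factors as $(1 - \rho_{23})(1 + \rho_{23} - 2\rho^2)$, so strongly correlated predictors of $X_1$ must also be strongly correlated with each other.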




Additional samples were generated under a trivariate lognormal
distribution. Each observation was obtained by generating
a trivariate normal observation $(Z_1, Z_2, Z_3)$ and making the
transformation $X_i = \exp(Z_i)$, $i = 1, 2, 3$. In order to obtain the
covariance matrix $\Sigma'$ for $(X_1, X_2, X_3)$, the generating trivariate
normal distribution has a covariance matrix with elements

$$\sigma_{ij} = \log[\rho_{ij}(e - 1) + 1],$$

where the $\rho_{ij}$ are the desired elements of $\Sigma'$. This trivariate
lognormal distribution not only has the advantage of being easy
to generate, but it also has marginal distributions that are
quite nonnormal.
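The mapping between the desired lognormal correlation and the covariance of the generating normal distribution can be sketched as follows, assuming the generating normal variables have unit variance (which is what the factor $e - 1$ implies). The function names are hypothetical.

```python
import math

def generating_cov(rho):
    """Normal-scale covariance that yields lognormal correlation rho."""
    return math.log(rho * (math.e - 1.0) + 1.0)

def lognormal_corr(sigma):
    """Correlation of exp(Z_i), exp(Z_j) when Cov(Z_i, Z_j) = sigma, Var(Z) = 1."""
    return (math.exp(sigma) - 1.0) / (math.e - 1.0)

sigma_for_half = generating_cov(0.5)        # covariance needed for rho = .5
round_trip = lognormal_corr(sigma_for_half) # recovers .5 up to rounding
most_negative = lognormal_corr(-1.0)        # bound reached at normal correlation -1
```

Because the generating covariance cannot be smaller than -1, the attainable lognormal correlation is bounded below by $(e^{-1} - 1)/(e - 1) \approx -.37$, which is one reason fewer parameter configurations are available than in the normal case.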

There are fewer parameter configurations which give a
positive definite covariance matrix for both this lognormal and
the generating trivariate normal distribution, however. In the
present study 30 such configurations which correspond to $H_0$ are
reported, and 45 which fall in the region of the alternative
were studied.

Five test statistics were evaluated in the full study
(although, as mentioned previously, some early parts of the study
included others). The five, with the abbreviations used in the
tables of results, are:

(W) Williams' test, as applied to the raw data. This is
the benchmark, at least as far as the normal distribution
is concerned, although its behavior under nonnormal
circumstances had not been studied.

(C) The test proposed by Choi (1977). This requires
replacing $X_{2i}$, $X_{3i}$ by their respective ranks, $R(X_{2i})$


Tables 1 and 2 present the results of the simulation study
under the trivariate normality assumption and at parameter
configurations consistent with $\rho_{12} = \rho_{13}$ for samples of size 10
and 25, respectively. At the .05 level used here, the particular
test being considered ought to reject $H_0$ approximately 50 times
in 1000 samples at any of these null parameter values.

The most readily apparent observations from the tables are
that, as expected, Williams' test very consistently rejects
about 5% of the time, and that both C and NS tend to be extremely
conservative when the magnitude of $\rho_{12} = \rho_{13}$ is large (in fact,
when $\rho_{12} = \rho_{13} = .9$, neither test rejected any of 1000 samples of
size 25 and together they rejected only 3 times for samples of size 10).
It is clear, in fact, that these two tests, which are based on
rank correlation and thus might be expected to be distribution
free, are not even parameter free. Note also that the two
procedures which replaced the data by scores (either ranks or normal
scores) and then used Williams' procedure behaved well. WR rejected
29 and 91 times in the two most extreme cases while WNS rejected 32
and 88 times in its most extreme cases. Although neither is as
stable as Williams' test (as expected), neither suffered
any serious difficulties in maintaining something close to the
nominal level for the test.

The power study at the normal distribution tends to confirm
the suppositions in the preceding paragraphs. Since $\rho_{12} \neq \rho_{13}$
causes the parameter space to be three-dimensional, the entire
study does not lend itself readily to tables. However, we
illustrate the points with a few sequences of parameters chosen
from the study with $\rho_{23}$ and $\rho_{13}$ fixed and $\rho_{12}$ moving
away from $\rho_{13}$, a case in which we expect to see increasing power.
Three such examples appear as Table 3. In each case we see that C and NS
have considerably less power than the competing procedures. (In
at least one case that observation must be tempered by noting that
C and NS fell significantly below the nominal level at the null
hypothesis, and thus might be expected to fall short in the power
comparisons at nearby parameter configurations as well.) We also
note again that while WNS and WR do not achieve the same power as
Williams' test, they do not fall disastrously short of the desired
performance.

Tables 4 and 5 illustrate the performance of the same test
statistics at the null hypothesis when the trivariate distribution
has lognormal marginals. Several important observations need to
be made here. First, as before, C and NS do not maintain the desired
.05 level. Again the parameter configurations where they had
the most difficulty were those which had the greatest magnitude
for $\rho_{12} = \rho_{13}$; as before, they tend to be extremely
conservative at those values.

Second, as might have been suspected, the behavior of Williams'
test breaks down for this highly skewed distribution. For sample
size 10, the observed significance level varies from .007 to .149
with 20 of the 30 parameter configurations giving values outside
the interval .037 to .063 (which is .050 ± 2 standard errors), and
for samples of size 25, the observed significance level varies from
.002 to .214 with 24 of the 30 parameter configurations giving
values outside the .037 to .063 interval. On the other hand, the
tests that replace the data by scores and then apply Williams'
procedure fared much better. WNS was outside the interval .037
to .063 only 3 of 30 times for samples of size 10 and 2 of 30 times
for samples of size 25, while the figures are 6 of 30 at n = 10
and 5 of 30 at n = 25 for the WR procedure.
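The two-standard-error band can be recomputed from the binomial standard error of a proportion estimated from 1000 samples at a true level of .05. The function name is hypothetical.

```python
import math

def two_se_interval(level=0.05, reps=1000):
    """Nominal level plus and minus two binomial standard errors."""
    se = math.sqrt(level * (1.0 - level) / reps)
    return level - 2.0 * se, level + 2.0 * se

low, high = two_se_interval()
```

This gives roughly .036 to .064, in line with the slightly narrower .037 to .063 band quoted above.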

In Table 6, sample sequences of parameter configurations which
move away from $H_0$ are again considered. It should be noted here
that the C and NS procedures have lower power than the other
procedures in general. It should also be noted that in the last example
the procedures using the score functions surpass Williams' test
in terms of power as $\rho_{12}$ and $\rho_{13}$ become more separated, even
though Williams' procedure had a large observed significance level.

The results here are typical of those of the whole study in
that, in most cases, the WR and WNS procedures were competitive
with W, never having an inordinately smaller power. In fact,
in some cases where W has more power, it appears attributable
to the fact that the true level of W is not very stable for
this distribution.

RECOMMENDATIONS 

In situations where a practitioner is comfortable with
the assumption of trivariate normality, it is recommended that
Williams' test be used. This is consistent with Neill and
Dunn (1975). On the other hand, when normality is not a good
assumption it is recommended that WR or WNS be used, as their
behavior is much more stable than that of Williams' test, and
competitive in terms of power. Between the two tests, the choice
might be difficult. Judging from the power study, WNS appears to be
slightly better. On the other hand, use of the ranks does not require
special tables, and the simpler computation, particularly
if it is to be done by hand, might be sufficient to recommend the
WR procedure.

In retrospect, one notices that replacing the data by ranks
and then applying the usual normal theory techniques to make the
appropriate inference is an idea that Iman and Conover (see Iman
(1974) or Iman and Conover (1979)) have espoused in a number of
other statistical settings.